PATENT


Amendments to the Claims:

This listing of claims will replace all prior versions and listings of claims in the application: 
Listing of Claims: 

1. (currently amended) A programmable processor comprising: 
a data path capable of transmitting data;

an external interface operable to receive data from an external source and 
communicate the received data over the data path; 

a register file containing a plurality of registers each having a register width, the 
register file coupled to the data path and operable to support processing of a plurality of threads 
and to store a plurality of data elements in partitioned fields, each of the data elements having an 
elemental width smaller than the register width;

an execution unit coupled to the data path, the execution unit operable to execute 
a plurality of instruction streams from the plurality of threads, each instruction stream including 
a single instruction that specifies an operation to cause multiple instances of the operation to be 
performed, each instance of the operation to be performed using a different one of [[, the operation
to be performed on each one of a]] the plurality of data elements in partitioned fields of at least
one of the registers to produce a catenated result[[, each of the data elements having an elemental
width smaller than the register width]].

2. (original) The processor of claim 1 wherein the execution unit comprises a 
pipeline having a plurality of stages and wherein the pipeline interleaves execution of 
instructions from the plurality of instruction streams. 

3. (original) The processor of claim 2 wherein the pipeline is operable to 
simultaneously contain states of execution of at least two instructions from different instruction 
streams.

4. (original) The processor of claim 2 wherein execution of the instructions is
interleaved in a round-robin manner. 

5. (previously presented) The processor of claim 1 wherein the processor ensures 
only one thread from the plurality of threads can have an exception handled at any given time. 

6. (original) The processor of claim 1 further comprising a virtual memory
addressing unit and a cache operable to store data communicated between the external interface 
and the data path. 

7. (previously presented) The processor of claim 1 wherein the execution unit is
further operable to, in response to decoding a second single instruction specifying a first and a
second register each containing a plurality of operands, multiply the plurality of floating point 
operands in the first register by the plurality of operands in the second register to produce a 
plurality of products and provide the plurality of products to partitioned fields of a result register 
as a second catenated result. 

8. (currently amended) A programmable processor comprising: 
a data path capable of transmitting data;

an external interface operable to receive data from an external source and 
communicate the received data over the data path; 

first and second register files containing a plurality of registers each having a 
register width, the first and second register files coupled to the data path and operable to support 
processing of first and second threads, respectively, and to store a plurality of data elements in
partitioned fields, each of the data elements having an elemental width smaller than the register 
width; 

an execution unit coupled to the data path, the execution unit operable to execute 
first and second instruction streams from the first and second threads, respectively, the first and 
second instruction streams each including a single instruction that specifies an operation to cause 
multiple instances of the operation to be performed, each instance of the operation to be 
performed using a different one of [[, the operation to be performed on each one of a]] the plurality
of data elements in partitioned fields of at least one of the registers to produce a catenated result[[,
each of the data elements having an elemental width smaller than the register width]].

9. (original) The processor of claim 8 wherein the execution unit comprises a 
pipeline having a plurality of stages and wherein the pipeline interleaves execution of 
instructions from the first instruction stream with instructions from the second instruction stream. 

10. (original) The processor of claim 9 wherein the pipeline is operable to 
simultaneously contain states of execution of an instruction from the first instruction stream and 
an instruction from the second instruction stream. 

11. (original) The processor of claim 9 wherein execution of the instructions is
interleaved in a round-robin manner. 

12. (previously presented) The processor of claim 9 wherein the execution unit is 
further operable to, in response to decoding a second single instruction specifying a first and a 
second register each containing a plurality of operands, multiply the plurality of floating point 
operands in the first register by the plurality of operands in the second register to produce a 
plurality of products and provide the plurality of products to partitioned fields of a result register 
as a second catenated result. 

13. (currently amended) A data processing system comprising: 

(a) a bus coupling components in the data processing system; 

(b) an external memory coupled to the bus; 

(c) a programmable microprocessor coupled to the bus and capable of operation 
independent of another host processor, the microprocessor comprising: 

a data path capable of transmitting data;

an external interface operable to receive data from an external source and 
communicate the received data over the data path;

a register file containing a plurality of registers each having a register width, the
register file coupled to the data path and operable to support processing of a plurality of threads 
and to store a plurality of data elements in partitioned fields, each of the data elements having an 
elemental width smaller than the register width;

an execution unit coupled to the data path, the execution unit operable to execute 
a plurality of instruction streams from the plurality of threads, each instruction stream including 
a single instruction that specifies an operation to cause multiple instances of the operation to be 
performed, each instance of the operation to be performed using a different one of [[, the operation
to be performed on each one of a]] the plurality of data elements in partitioned fields of at least
one of the registers to produce a catenated result[[, each of the data elements having an elemental
width smaller than the register width]].

14. (original) The system of claim 13 wherein the execution unit comprises a 
pipeline having a plurality of stages and wherein the pipeline interleaves execution of 
instructions from the plurality of instruction streams. 

15. (original) The system of claim 14 wherein the pipeline is operable to 
simultaneously contain states of execution of at least two instructions from different instruction 
streams. 

16. (original) The system of claim 14 wherein execution of the instructions is 
interleaved in a round-robin manner. 

17. (previously presented) The system of claim 13 wherein the processor ensures 
only one thread from the plurality of threads can have an exception handled at any given time. 

18. (original) The system of claim 13 further comprising a virtual memory 
addressing unit and a cache operable to store data communicated between the external interface 
and the data path.

19. (previously presented) The system of claim 13 wherein the execution unit is
further operable to, in response to decoding a second single instruction specifying a first and a 
second register each containing a plurality of operands, multiply the plurality of floating point 
operands in the first register by the plurality of operands in the second register to produce a 
plurality of products and provide the plurality of products to partitioned fields of a result register 
as a second catenated result. 

20. (currently amended) A data processing system comprising: 

(a) a bus coupling components in the data processing system; 

(b) an external memory coupled to the bus; 

(c) a programmable microprocessor coupled to the bus and capable of operation 
independent of another host processor, the microprocessor comprising: 

a data path capable of transmitting data;

an external interface operable to receive data from an external source and 
communicate the received data over the data path; 

first and second register files containing a plurality of registers each having a 
register width, the first and second register files coupled to the data path and operable to support 
processing of first and second threads, respectively, and to store a plurality of data elements in
partitioned fields, each of the data elements having an elemental width smaller than the register 
width;

an execution unit coupled to the data path, the execution unit operable to execute 
first and second instruction streams from the first and second threads, respectively, the first and 
second instruction streams each including a single instruction that specifies an operation to cause 
multiple instances of the operation to be performed, each instance of the operation to be 
performed using a different one of [[, the operation to be performed on each one of a]] the plurality
of data elements in partitioned fields of at least one of the registers to produce a catenated result[[,
each of the data elements having an elemental width smaller than the register width]].

21. (original) The system of claim 20 wherein the execution unit comprises a
pipeline having a plurality of stages and wherein the pipeline interleaves execution of 
instructions from the first instruction stream with instructions from the second instruction stream. 

22. (original) The system of claim 21 wherein the pipeline is operable to 
simultaneously contain states of execution of an instruction from the first instruction stream and 
an instruction from the second instruction stream. 

23. (original) The system of claim 21 wherein execution of the instructions is 
interleaved in a round-robin manner. 

24. (previously presented) The system of claim 21 wherein the execution unit is 
further operable to, in response to decoding a second single instruction specifying a first and a 
second register each containing a plurality of operands, multiply the plurality of floating point 
operands in the first register by the plurality of operands in the second register to produce a 
plurality of products and provide the plurality of products to partitioned fields of a result register 
as a second catenated result. 